ALI Performance Tests on Weaver

Introduction

This notebook tracks nightly performance tests of the Greenland Ice Sheet (GIS) in Albany Land Ice (ALI), run on Nvidia Tesla V100 GPUs on Weaver. Note: Waterman was moved to a different network and renamed Weaver; the last test on Waterman was executed on 6/16/2020.

Architectures:

Name          Weaver (P9/V100)
CPU           Dual-socket IBM POWER9
GPU           Nvidia Tesla V100
Cores/Node    40
Threads/Core  8
GPUs/Node     4
Memory/Node   319 GB
Interconnect  Mellanox EDR IB (100 Gb/s)
Compiler      gcc 7.2.0
GPU Compiler  cuda 10.1.105
MPI           openmpi 4.0.1

Cases:

Case Name                     Number of Processes (np)  Description
green-1-10km_ent_fea_mem_tet  8                         Unstructured 1-10km GIS, Enthalpy problem, finite element assembly only, memoization, tetrahedron
green-1-7km_fea_1ws           8                         Unstructured 1-7km GIS, finite element assembly only, single workset
green-1-7km_fea_mem           8                         Unstructured 1-7km GIS, finite element assembly only, memoization
green-1-7km_muk_ls_mem        8                         Unstructured 1-7km GIS, MueLu w/ kokkos and line smoothing, memoization
green-3-20km_vel_fea_mem_tet  8                         Unstructured 3-20km GIS, Velocity problem, finite element assembly only, memoization, tetrahedron
green-3-20km_vel_fea_mem_wdg  8                         Unstructured 3-20km GIS, Velocity problem, finite element assembly only, memoization, wedge
green-3-20km_ent_fea_mem_tet  8                         Unstructured 3-20km GIS, Enthalpy problem, finite element assembly only, memoization, tetrahedron
green-3-20km_ent_fea_mem_wdg  8                         Unstructured 3-20km GIS, Enthalpy problem, finite element assembly only, memoization, wedge

Timers:

Timer Name                             Level  Description
Albany Total Time                      0      Total wall-clock time of simulation
Albany: Setup Time                     1      Preprocess
Albany: Total Fill Time                1      Finite element assembly
Albany Fill: Residual                  2      Residual assembly
Albany Residual Fill: Evaluate         3      Compute the residual, local/global assembly
Albany Residual Fill: Export           3      Update global residual across MPI ranks
Albany Fill: Jacobian                  2      Jacobian assembly
Albany Jacobian Fill: Evaluate         3      Compute the Jacobian, local/global assembly
Albany Jacobian Fill: Export           3      Update global Jacobian across MPI ranks
NOX Total Preconditioner Construction  1      Construct Preconditioner
NOX Total Linear Solve                 1      Linear Solve
In [1]:
import datetime as dt
import glob
import numpy as np
import pandas as pd
import json
import multiprocessing
import sys

import plotly.graph_objects as go
from plotly.offline import iplot, init_notebook_mode

# Import scripts
sys.path.insert(0,'kcshan-perf-analysis')
from json2timeline import json2dataframe
from models import single_ts_chgpts
from basicstats import add_regime_stats
from utils import * 
In [2]:
hide_code_button()
Out[2]:
In [3]:
# Enable offline plot export
init_notebook_mode(connected=True)

Specifications

In [4]:
# Load configuration file
with open('config.json') as jf:
    config = json.load(jf)
check_config(config)
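# Expose each config entry as a notebook-level variable
# (same effect as exec(key + '=val'), without executing config strings)
for key, val in config.items():
    globals()[key] = val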

# Extract file names and collect data
files = glob.glob(json_regex)
df = json2dataframe(files, cases, names, timers, metadata)
if len(df) == 0:
    raise RuntimeError('No data found; check json directory')

# Log-transform the data before modeling
xform = np.log
inv_xform = np.exp

# Add other metrics to name list
names.append('max host memory')
names.append('max kokkos memory')
metadata.remove('max host memory')
metadata.remove('max kokkos memory')

# # Filter data by date if desired
# import datetime as dt
# df = df[df['date'] < dt.datetime.strptime('20191231', '%Y%m%d')]
In [5]:
print('Test cases:')
for c in cases:
    print('  ' + c)
print('Metrics:')
for n in names:
    print('  ' + n)
print('Metadata:')
for m in metadata:
    print('  ' + m)
print("Model threshold: %f" % threshold)
Test cases:
  green-1-10km_ent_fea_1ws_tet_np8
  green-1-10km_ent_fea_mem_tet_np8
  green-1-10km_ent_muk_tet_np8
  green-1-7km_fea_1ws_np8
  green-1-7km_fea_mem_np8
  green-1-7km_muk_ls_mem_np8
  green-3-20km_vel_fea_mem_tet_np8
  green-3-20km_vel_fea_mem_wdg_np8
  green-3-20km_vel_muk_wdg_np8
  green-3-20km_ent_fea_mem_tet_np8
  green-3-20km_ent_fea_mem_wdg_np8
  green-3-20km_ent_muk_wdg_np8
Metrics:
  Total Time
  Setup Time
  Total Fill Time
  Residual Fill
  Residual Fill Evaluate
  Residual Fill Export
  Jacobian Fill
  Jacobian Fill Evaluate
  Jacobian Fill Export
  NOX Total Preconditioner Construction
  NOX Total Linear Solve
  max host memory
  max kokkos memory
Metadata:
  Albany cxx compiler
  Albany git commit id
  Trilinos git commit id
Model threshold: 0.005000
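
The configuration cell log-transforms every metric before modeling. One common reason for doing this with timing data is that a relative change becomes a constant shift: a 10% slowdown moves the log value by log(1.1), about 0.0953, whether the baseline is 2 s or 200 s. The sketch below illustrates this. The helper name `log_shift` and the baseline values are hypothetical, and the sketch makes no claim about how `threshold` is interpreted by the model.

```python
import math

def log_shift(baseline, factor):
    """Shift in log space when `baseline` seconds is scaled by `factor`."""
    return math.log(baseline * factor) - math.log(baseline)

# Hypothetical baselines: a fast timer and a slow timer, both 10% slower.
for baseline in (2.0, 200.0):
    absolute = baseline * 1.10 - baseline
    print(f"baseline={baseline:6.1f} s  absolute change={absolute:5.2f} s  "
          f"log shift={log_shift(baseline, 1.10):.4f}")
```

The absolute change differs by a factor of 100 between the two timers, while the log shift is the same, so one model setting can be applied across timers of very different magnitude.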

Performance Timelines

In [6]:
# Raise on floating-point errors (e.g. log of a non-positive value) instead of warning
np.seterr(all='raise')

# Find changepoints and format data to work nicely with plots
seqs = {case:{} for case in cases}
most_recent = df['date'].max()
events = {}
pool = multiprocessing.Pool(4)

print('Finding changepoints:')
for case in cases:
    print(case)
    
    # Add time data to seqs
    for name in names:
        cols = ['date', name] + list(metadata)
        data = df.loc[df['case']==case, cols].dropna(subset=[name])
        data.reset_index(drop=True, inplace=True)
        data.rename(columns={name:'time'}, inplace=True)
        data['time'] = xform(data['time'])
        seqs[case][name] = data
    
    # Detect changepoints  
    pool_inputs = [(k, v, threshold) for k,v in seqs[case].items()]
    chgpts = dict(pool.map(single_ts_chgpts, pool_inputs))
    
    for name in names:
        # Calculate mean/std between changepoints
        seqs[case][name] = add_regime_stats(seqs[case][name], chgpts[name])
        
        # Build dictionary of changepoints
        for d in seqs[case][name]['date'].iloc[chgpts[name]]:
            events.setdefault(d, {}).setdefault(case, []).append(name)
# Release the worker processes once all cases are done
pool.close()
pool.join()
clear_output()

# Sort and print results
events = {k:events[k] for k in sorted(events.keys())}
print('Events in the most recent %d days:' % recency)
recent_events = print_events(events, most_recent, recency)
Events in the most recent 10 days:
05/28/2021:
    green-1-10km_ent_muk_tet_np8: max host memory
    green-1-7km_fea_1ws_np8: Total Time
                             Total Fill Time
                             Residual Fill
                             Jacobian Fill
                             Jacobian Fill Evaluate
                             Jacobian Fill Export
    green-1-7km_fea_mem_np8: Total Time
    green-1-7km_muk_ls_mem_np8: Total Time
                                NOX Total Preconditioner Construction
                                NOX Total Linear Solve
    green-3-20km_vel_fea_mem_tet_np8: NOX Total Linear Solve
    green-3-20km_vel_fea_mem_wdg_np8: NOX Total Linear Solve
    green-3-20km_ent_fea_mem_tet_np8: NOX Total Linear Solve
    green-3-20km_ent_fea_mem_wdg_np8: NOX Total Linear Solve
    green-3-20km_ent_muk_wdg_np8: max host memory
05/29/2021:
    green-1-10km_ent_fea_1ws_tet_np8: Total Time
                                      Total Fill Time
                                      Residual Fill
                                      Residual Fill Evaluate
                                      Jacobian Fill
                                      Jacobian Fill Evaluate
                                      NOX Total Linear Solve
                                      max host memory
                                      max kokkos memory
    green-1-10km_ent_fea_mem_tet_np8: Total Time
                                      Total Fill Time
                                      Residual Fill
                                      Residual Fill Evaluate
                                      Residual Fill Export
                                      Jacobian Fill
                                      Jacobian Fill Evaluate
                                      Jacobian Fill Export
                                      max kokkos memory
    green-1-10km_ent_muk_tet_np8: Total Fill Time
                                  Residual Fill
                                  Residual Fill Evaluate
                                  Jacobian Fill
                                  Jacobian Fill Evaluate
                                  Jacobian Fill Export
                                  max kokkos memory
    green-1-7km_fea_1ws_np8: Total Fill Time
                             Jacobian Fill
                             NOX Total Linear Solve
                             max host memory
                             max kokkos memory
    green-1-7km_fea_mem_np8: Total Time
                             Total Fill Time
                             Residual Fill
                             Residual Fill Evaluate
                             Residual Fill Export
                             Jacobian Fill
                             Jacobian Fill Evaluate
                             Jacobian Fill Export
                             max host memory
                             max kokkos memory
    green-1-7km_muk_ls_mem_np8: Total Fill Time
                                Residual Fill
                                Residual Fill Evaluate
                                Residual Fill Export
                                Jacobian Fill
                                Jacobian Fill Evaluate
                                Jacobian Fill Export
                                max host memory
                                max kokkos memory
    green-3-20km_vel_fea_mem_tet_np8: Total Fill Time
                                      Residual Fill
                                      Jacobian Fill
                                      max kokkos memory
    green-3-20km_vel_fea_mem_wdg_np8: Total Fill Time
                                      Residual Fill
                                      Jacobian Fill
                                      Jacobian Fill Evaluate
                                      max host memory
                                      max kokkos memory
    green-3-20km_vel_muk_wdg_np8: Total Fill Time
                                  Residual Fill
                                  Residual Fill Evaluate
                                  Jacobian Fill
                                  Jacobian Fill Evaluate
                                  max host memory
                                  max kokkos memory
    green-3-20km_ent_fea_mem_tet_np8: Residual Fill
                                      max kokkos memory
    green-3-20km_ent_fea_mem_wdg_np8: max kokkos memory
    green-3-20km_ent_muk_wdg_np8: max host memory
                                  max kokkos memory
05/30/2021:
    green-3-20km_ent_fea_mem_tet_np8: Jacobian Fill Evaluate
05/31/2021:
    green-3-20km_vel_fea_mem_tet_np8: Jacobian Fill Evaluate
    green-3-20km_vel_muk_wdg_np8: Total Time
                                  Setup Time
    green-3-20km_ent_fea_mem_tet_np8: Residual Fill Evaluate
06/01/2021:
    green-3-20km_ent_fea_mem_wdg_np8: Residual Fill
06/03/2021:
    green-3-20km_ent_fea_mem_tet_np8: max host memory
06/04/2021:
    green-1-10km_ent_fea_1ws_tet_np8: NOX Total Preconditioner Construction
                                      max kokkos memory
    green-1-10km_ent_fea_mem_tet_np8: Jacobian Fill
                                      NOX Total Preconditioner Construction
                                      max kokkos memory
    green-1-10km_ent_muk_tet_np8: max kokkos memory
    green-1-7km_fea_1ws_np8: max kokkos memory
    green-1-7km_fea_mem_np8: max kokkos memory
    green-3-20km_vel_fea_mem_tet_np8: Jacobian Fill Export
                                      max kokkos memory
    green-3-20km_vel_fea_mem_wdg_np8: Jacobian Fill Export
                                      max kokkos memory
    green-3-20km_vel_muk_wdg_np8: NOX Total Preconditioner Construction
                                  NOX Total Linear Solve
    green-3-20km_ent_fea_mem_tet_np8: Total Time
                                      Total Fill Time
                                      Jacobian Fill
                                      Jacobian Fill Export
                                      max kokkos memory
    green-3-20km_ent_fea_mem_wdg_np8: Total Time
                                      Total Fill Time
                                      Jacobian Fill
                                      Jacobian Fill Export
                                      max kokkos memory
    green-3-20km_ent_muk_wdg_np8: Jacobian Fill
                                  Jacobian Fill Export
                                  NOX Total Preconditioner Construction
                                  NOX Total Linear Solve
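
The listing above covers a window that ends at the most recent run in the data, not at today's date, since `most_recent` is taken from `df['date'].max()`. A minimal sketch of such a filter follows. `recent_events_sketch` is a hypothetical stand-in, not the `print_events` helper from `utils`, and treating the window boundary as inclusive is an assumption.

```python
import datetime as dt

def recent_events_sketch(events, most_recent, recency):
    """Keep events dated within `recency` days of `most_recent` (inclusive)."""
    cutoff = most_recent - dt.timedelta(days=recency)
    return {d: per_case for d, per_case in events.items() if d >= cutoff}

# Hypothetical events in the same nested shape: date -> case -> [metric, ...]
events = {}
events.setdefault(dt.datetime(2021, 5, 1), {}).setdefault('case_a', []).append('Total Time')
events.setdefault(dt.datetime(2021, 6, 4), {}).setdefault('case_a', []).append('Jacobian Fill')
events.setdefault(dt.datetime(2021, 6, 4), {}).setdefault('case_b', []).append('Total Time')

recent = recent_events_sketch(events, most_recent=dt.datetime(2021, 6, 4), recency=10)

# Cases with at least one recent changepoint (these get a '*' in the plot menu)
changed_cases = {case for per_case in recent.values() for case in per_case}
print(sorted(changed_cases))
```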
In [7]:
lines = ['time', 'mean', 'upper', 'lower']
colors = ['blue', 'red', 'red', 'red']
modes = ['markers', 'lines', 'lines', 'lines']
dashes = ['solid', 'solid', 'dot', 'dot']

fig = go.Figure()
# Create series on plot
for line, color, mode, dash in zip(lines, colors, modes, dashes):
    for c in cases:
        if line == 'time':
            fig.add_trace(go.Scatter(
                x=seqs[c][names[0]]['date'],
                y=inv_xform(seqs[c][names[0]][line]),
                mode=mode,
                line = dict(color=color, dash=dash, width=1.5),
                name=line,
                visible=(c == cases[0]),
                customdata=seqs[c][names[0]][['date']+list(metadata)],
                hovertemplate=
                "Date: %{customdata[0]}<br>" +
#                 "Albany compiler: %{customdata[1]}<br>" +
                "Albany commit: %{customdata[2]}<br>" +
                "Trilinos commit: %{customdata[3]}" +
                "<extra></extra>",
            ))
        else:
            fig.add_trace(go.Scatter(
                x=seqs[c][names[0]]['date'],
                y=inv_xform(seqs[c][names[0]][line]),
                mode=mode,
                line = dict(color=color, dash=dash, width=1.5),
                name=line,
                visible=(c == cases[0]),
                hoverinfo='skip'
            ))

changed_cases = {n for v in recent_events.values() for n in v.keys()}

# Test case dropdown
case_options = [dict(
        args=['visible', [True if x==c else False for x in np.tile(cases, len(lines))]],
        label= '*'+c if c in changed_cases else c,
        method='restyle'
    ) for c in cases]
    
# Timer dropdown
name_options = [dict(
        args=[{'x': [seqs[c][n]['date'] for _ in lines for c in cases],
               'y': [inv_xform(seqs[c][n][line]) for line in lines for c in cases],
               'customdata': [seqs[c][n][['date']+list(metadata)].to_numpy()
                              if line == 'time' else None
                              for line in lines for c in cases]}],
        label=n,
        method='restyle'
    ) for n in names]

# Add dropdowns to plot
fig.update_layout(
    updatemenus=[
        go.layout.Updatemenu(
            buttons=list(case_options),
            direction="down",
            pad={"r": 10, "t": 10},
            showactive=True,
            x=0,    xanchor="left",
            y=1.15, yanchor="top"
        ),
        go.layout.Updatemenu(
            buttons=list(name_options),
            direction="down",
            pad={"r": 10, "t": 10},
            showactive=True,
            x=0.3, xanchor="left",
            y=1.15, yanchor="top"
        ),
    ],
    margin={'l': 50, 'r': 50, 'b': 200, 't': 50},
    height=600,
    xaxis_title='Simulation Date',
    yaxis_title='Wall-clock Time (s) or Memory (MiB)'
)

iplot(fig)

Plot of wall-clock times or memory for nightly runs

Changepoints are estimated using a generalized likelihood ratio method on each timer, and then merged over all timers for a given test case.
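
To make the idea concrete, here is a minimal offline sketch for a single shift in mean. For every candidate split it computes a two-sample t-type statistic with pooled variance, which for a Gaussian mean shift with common unknown variance is a monotone function of the likelihood ratio, and it keeps the split that maximizes it. This is not the implementation of `single_ts_chgpts`, whose details are not shown in this notebook; the function name `best_split` and the sample data are hypothetical.

```python
import math
from statistics import fmean

def best_split(x):
    """Return (k, stat) maximizing the two-sample statistic over splits.

    k is the index of the first observation after the candidate change,
    so the two regimes are x[:k] and x[k:].
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    best_k, best_stat = None, -1.0
    for k in range(1, n):  # O(n^2) overall; fine for a short nightly series
        left, right = x[:k], x[k:]
        m1, m2 = fmean(left), fmean(right)
        # Pooled within-regime sum of squares (common-variance assumption)
        ss = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        s = math.sqrt(ss / (n - 2))
        if s == 0.0:
            # Noise-free data: any difference in means is an unambiguous shift
            stat = math.inf if m1 != m2 else 0.0
        else:
            stat = math.sqrt(k * (n - k) / n) * abs(m2 - m1) / s
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

# Hypothetical wall-clock times (s) with a slowdown starting at index 5
times = [10.0, 10.2, 9.9, 10.1, 10.0, 13.0, 12.9, 13.1, 13.0, 13.2]
k, stat = best_split([math.log(t) for t in times])
print(k, round(stat, 1))
```

A change would be declared only when the maximized statistic exceeds a calibrated threshold; how the notebook's `threshold` setting enters `single_ts_chgpts` is not shown here.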

  • Blue markers: recorded wall-clock time or memory
  • Solid red line: average wall-clock time or memory between changepoints
  • Dotted red lines: average wall-clock time or memory $\pm$ two standard deviations
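
The red lines can be reproduced from a series and its changepoint indices as sketched below. This is an illustration only, not the `add_regime_stats` helper: the name `regime_bands` is hypothetical, and it assumes each changepoint index marks the first observation of a new regime. Note that the plotting cell applies `inv_xform` to the mean and band columns, which suggests the statistics are computed on log-transformed values; under that assumption the solid line is a geometric mean in the original units and the bands are symmetric in log space rather than in seconds.

```python
from statistics import fmean, stdev

def regime_bands(values, chgpts, width=2.0):
    """Return one (mean, upper, lower) tuple per observation.

    Statistics are computed separately within each regime delimited by
    the changepoint indices in `chgpts`.
    """
    n = len(values)
    interior = sorted({c for c in chgpts if 0 < c < n})
    bounds = [0] + interior + [n]
    rows = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        segment = values[start:stop]
        mean = fmean(segment)
        # A single-point regime has no spread to estimate
        std = stdev(segment) if len(segment) > 1 else 0.0
        rows.extend([(mean, mean + width * std, mean - width * std)] * len(segment))
    return rows

# Hypothetical series with one changepoint at index 3
bands = regime_bands([1.0, 2.0, 3.0, 10.0, 12.0], chgpts=[3])
for row in bands:
    print(row)
```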

Plot window controls

  • Test case and timer can be selected from the drop-down menus (* marks test cases with changepoints detected in the recent window)
  • Hovering over a data point shows the run date and the Albany and Trilinos commit IDs
  • Clicking on the legend will show/hide individual plot elements
  • Click and drag to zoom in; double click to reset zoom
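
The test-case drop-down relies on trace ordering: traces are added line type by line type and, within each line type, case by case, so a `restyle` of `visible` needs one boolean per trace in that same order. The sketch below builds that mask in plain Python. `visibility_mask` and the shortened case labels are hypothetical.

```python
def visibility_mask(cases, lines, selected):
    """One boolean per trace, ordered by line type first, then by case."""
    return [case == selected for _ in lines for case in cases]

lines = ['time', 'mean', 'upper', 'lower']
cases = ['case_a', 'case_b', 'case_c']  # hypothetical labels

mask = visibility_mask(cases, lines, selected='case_b')

# A Plotly updatemenu button passes the mask through the 'restyle' method
button = dict(args=['visible', mask], label='case_b', method='restyle')
print(mask)
```

Selecting a case therefore shows exactly four traces (markers, mean, and the two band lines) and hides the rest.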

Pollak, M.; Siegmund, D. Sequential Detection of a Change in a Normal Mean when the Initial Value is Unknown. Ann. Statist. 19 (1991), no. 1, 394-416. doi:10.1214/aos/1176347990. https://projecteuclid.org/euclid.aos/1176347990

Siegmund, D.; Venkatraman, E. S. Using the Generalized Likelihood Ratio Statistic for Sequential Detection of a Change-Point. Ann. Statist. 23 (1995), no. 1, 255-271. doi:10.1214/aos/1176324466. https://projecteuclid.org/euclid.aos/1176324466

Hawkins, D. M.; Zamba, K. D. Statistical Process Control for Shifts in Mean or Variance using a Change Point Formulation. Technometrics 47 (2005), 164-173.

Hawkins, D. M.; Qiu, P.; Kang, C. W. The Changepoint Model for Statistical Process Control. Journal of Quality Technology 35 (2003), no. 4, 355-366.